The Derivation of a Grammatically Indexed Lexicon from the Longman Dictionary of Contemporary English

Authors

Branimir Boguraev

Ted Briscoe

John A. Carroll

David M. Carter

Claire Grover

Abstract

We describe a methodology and associated software system for constructing a large lexicon from an existing machine-readable (published) dictionary. The lexicon serves as a component of an English morphological and syntactic analyser and contains entries with grammatical definitions compatible with the word and sentence grammar employed by the analyser. The software system has two integrated components. The first extracts syntactically rich, theory-neutral lexical templates from a suitable machine-readable source. The second supports interactive and semi-automatic generation and testing of target lexical entries, in order to derive a sizeable, accurate and consistent lexicon from a source dictionary that contains partial (and occasionally inaccurate) information. Finally, we evaluate the utility of the Longman Dictionary of Contemporary English as a source dictionary for the target lexicon.

Related resources

Large Lexicons for Natural Language Processing: Utilising the Grammar Coding System of LDOCE

This article focusses on the derivation of large lexicons for natural language processing. We describe the development of a dictionary support environment linking a restructured version of the Longman Dictionary of Contemporary English to natural language processing systems. The process of restructuring the information in the machine readable version of the dictionary is discussed. The Longman ...

Acquisition Of Computational-Semantic Lexicons From Machine Readable Lexical Resources

This paper describes a heuristic algorithm capable of automatically assigning a label to each of the senses in a machine readable dictionary (MRD) for the purpose of acquiring a computational-semantic lexicon for treatment of lexical ambiguity. Including these labels in the MRD-based lexical database offers several positive effects. The labels can be used as a coarser sense division so unnecess...

A Class-based Approach to Word Alignment

This paper presents an algorithm capable of identifying the translation for each word in a bilingual corpus. Previously proposed methods rely heavily on word-based statistics. Under a word-based approach, frequent words with a consistent translation can be aligned at a high rate of precision. However, words that are less frequent or exhibit diverse translations generally do not have statistical...

Combining Machine Readable Lexical Resources and Bilingual Corpora for Broad Word Sense Disambiguation

This paper describes a new approach to word sense disambiguation (WSD) based on automatically acquired word sense division. The semantically related sense entries in a bilingual dictionary are arranged in clusters using a heuristic labeling algorithm to provide a more complete and appropriate sense division for WSD. Multiple translations of senses serve as outside information for automatic tag...

Class Based Sense Definition Model for Word Sense Tagging and Disambiguation

We present an unsupervised learning strategy for word sense disambiguation (WSD) that exploits multiple linguistic resources including a parallel corpus, a bilingual machine readable dictionary, and a thesaurus. The approach is based on Class Based Sense Definition Model (CBSDM) that generates the glosses and translations for a class of word senses. The model can be applied to resolve sense amb...

Publication date: 1987
